Overview

Now we will use all of the skills we have learned in this workshop in a real-world case study. This example walks through using R and publicly available datasets to examine how drought is influencing wind erosion in the Chihuahuan Desert. Our goal is to reproduce Figure 4 from McCord et al. 2023 (https://doi.org/10.1002/ael2.20120). First, we will query the Landscape Data Commons for data that fall within the Chihuahuan Desert ecoregion. Then, we will use these data to set a benchmark of expected wind erosion risk in the Chihuahuan Desert. With this benchmark in place, we will apply it to the monitoring data in the Landscape Data Commons and compare those results at multiple spatial scales. Finally, we will incorporate drought and remote sensing-based fractional cover data to provide additional context for the field-based monitoring data.

Set your working directory and make sure it ends in Part2. If you press Tab inside an empty string ("") in setwd(), RStudio will help you autocomplete the path.

getwd() # check your current working drive
## [1] "C:/Users/gharrison/OneDrive - USDA/Documents/R scripts/RinRange_workshop_srm24/RinRangelands-main_1.31.24/RinRangelands-main/Part2"
# setwd("") # change this if needed

Load the packages we will use in this demo.

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.3     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.3     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(trex) # for pulling data from the Landscape Data Commons
library(sf) # for processing spatial data
## Linking to GEOS 3.11.2, GDAL 3.7.2, PROJ 9.3.0; sf_use_s2() is TRUE
library(httr2) # for Climate Engine API requests
library(mapview) # for visualizing spatial data
## Warning: package 'mapview' was built under R version 4.3.2
library(ggpubr) # for combining multiple ggplot objects
library(maps)
## Warning: package 'maps' was built under R version 4.3.2
## 
## Attaching package: 'maps'
## 
## The following object is masked from 'package:purrr':
## 
##     map

1. Prepare Data

Now we need to pull in some polygons for our regions of interest so that we can set spatial boundaries on our data.

# Bring in the polygons for our regions of interest
chihuahuan_desert <- sf::read_sf("MapLayers/ChihuahuanDesert.shp")
sandy_esg <- sf::read_sf("MapLayers/SandyESG.shp")

# extract data from LDC based on the chihuahuan_desert polygon
data <- fetch_ldc_spatial(chihuahuan_desert, data_type = "indicators")
## Warning: attribute variables are assumed to be spatially constant throughout
## all geometries
head(data)
##     rid                 PrimaryKey
## 1 50790 06080810205047152005-07-19
## 2 50791 06080810212461432005-07-19
## 3 50792 06080810220026862005-07-19
## 4 50793 06080810224129002005-07-20
## 5 50794 06051514053759212006-04-19
## 6 50795 06051514053759212007-03-09
##                                                   DBKey ProjectKey
## 1 MURV_2003topresent_DIMA5pt5_11May2021ANON_DatesEdited       MURV
## 2 MURV_2003topresent_DIMA5pt5_11May2021ANON_DatesEdited       MURV
## 3 MURV_2003topresent_DIMA5pt5_11May2021ANON_DatesEdited       MURV
## 4 MURV_2003topresent_DIMA5pt5_11May2021ANON_DatesEdited       MURV
## 5 MURV_2003topresent_DIMA5pt5_11May2021ANON_DatesEdited       MURV
## 6 MURV_2003topresent_DIMA5pt5_11May2021ANON_DatesEdited       MURV
##                DateVisited EcologicalSiteId PercentCoveredByEcoSite
## 1 2005-07-19T00:00:00.000Z      R042XB010NM                      NA
## 2 2005-07-19T00:00:00.000Z      R042XB035NM                      NA
## 3 2005-07-19T00:00:00.000Z      R042XB010NM                      NA
## 4 2005-07-20T00:00:00.000Z      R042XB027NM                      NA
## 5 2006-04-19T00:00:00.000Z      R042XB021NM                      NA
## 6 2007-03-09T00:00:00.000Z      R042XB021NM                      NA
##   Latitude_NAD83 Longitude_NAD83 LocationStatus LocationType
## 1        33.0389        -107.336             NA           NA
## 2        33.0385        -107.337             NA           NA
## 3        33.0386        -107.336             NA           NA
## 4        32.9359         -107.26             NA           NA
## 5         32.563        -106.521             NA           NA
## 6         32.563        -106.521             NA           NA
##   Latitude_NAD83_Actual Longitude_NAD83_Actual BareSoilCover TotalFoliarCover
## 1                    NA                     NA          8.50            33.00
## 2                    NA                     NA          3.00            32.00
## 3                    NA                     NA          7.50            17.00
## 4                    NA                     NA          7.00            57.00
## 5                    NA                     NA          3.75            48.75
## 6                    NA                     NA            NA               NA
##   AH_AnnGrassCover AH_ForbCover AH_GrassCover AH_PerenForbCover
## 1              0.0         23.0           7.5               5.0
## 2              0.0         15.5          12.5               4.5
## 3              0.0         12.0           2.0               7.0
## 4             14.5          8.0          29.5               2.5
## 5              0.0          7.5           2.5               0.0
## 6               NA           NA            NA                NA
##   AH_PerenForbGrassCover AH_PerenGrassCover AH_ShrubCover FH_CyanobacteriaCover
## 1                   12.5                7.5             5                     0
## 2                   16.0               12.5             8                     0
## 3                    8.0                2.0             5                     0
## 4                   19.0               16.5            17                     0
## 5                    2.5                2.5             0                     0
## 6                     NA                 NA            NA                     0
##   FH_DepSoilCover FH_DuffCover FH_EmbLitterCover FH_HerbLitterCover
## 1               0            0                 0               10.5
## 2               0            0                 0                8.5
## 3               0            0                 0                8.0
## 4               0            0                 0                7.5
## 5               0            0                 0               17.5
## 6               0            0                 0                 NA
##   FH_LichenCover FH_MossCover FH_RockCover FH_TotalLitterCover
## 1              0            0        48.00               10.50
## 2              0            0        56.50                8.50
## 3              0            0        67.50                8.00
## 4              0            0        28.50                7.50
## 5              0            0        28.75               18.75
## 6             NA            0           NA                  NA
##   FH_VagrLichenCover FH_WaterCover FH_WoodyLitterCover GapCover_101_200
## 1                  0             0                0.00               NA
## 2                  0             0                0.00               NA
## 3                  0             0                0.00               NA
## 4                  0             0                0.00               NA
## 5                  0             0                1.25               NA
## 6                  0             0                  NA               NA
##   GapCover_200_plus GapCover_25_50 GapCover_25_plus GapCover_51_100
## 1                NA             NA               NA              NA
## 2                NA             NA               NA              NA
## 3                NA             NA               NA              NA
## 4                NA             NA               NA              NA
## 5                NA             NA               NA              NA
## 6                NA             NA               NA              NA
##   Hgt_Forb_Avg Hgt_Grass_Avg Hgt_Herbaceous_Avg Hgt_PerenForb_Avg
## 1           NA            NA                 NA                NA
## 2           NA            NA                 NA                NA
## 3           NA            NA                 NA                NA
## 4           NA            NA                 NA                NA
## 5           NA            NA                 NA                NA
## 6           NA            NA                 NA                NA
##   Hgt_PerenForbGrass_Avg Hgt_PerenGrass_Avg Hgt_Woody_Avg RH_AnnualProd
## 1                     NA                 NA            NA          <NA>
## 2                     NA                 NA            NA          <NA>
## 3                     NA                 NA            NA          <NA>
## 4                     NA                 NA            NA          <NA>
## 5                     NA                 NA            NA          <NA>
## 6                     NA                 NA            NA          <NA>
##   RH_BareGround RH_BioticIntegrity RH_CommentsBI RH_CommentsHF RH_CommentsSS
## 1          <NA>               <NA>            NA            NA            NA
## 2          <NA>               <NA>            NA            NA            NA
## 3          <NA>               <NA>            NA            NA            NA
## 4          <NA>               <NA>            NA            NA            NA
## 5          <NA>               <NA>            NA            NA            NA
## 6          <NA>               <NA>            NA            NA            NA
##   RH_Compaction RH_DeadDyingPlantParts RH_FuncSructGroup RH_Gullies
## 1          <NA>                   <NA>              <NA>       <NA>
## 2          <NA>                   <NA>              <NA>       <NA>
## 3          <NA>                   <NA>              <NA>       <NA>
## 4          <NA>                   <NA>              <NA>       <NA>
## 5          <NA>                   <NA>              <NA>       <NA>
## 6          <NA>                   <NA>              <NA>       <NA>
##   RH_HydrologicFunction RH_InvasivePlants RH_LitterAmount RH_LitterMovement
## 1                  <NA>              <NA>            <NA>              <NA>
## 2                  <NA>              <NA>            <NA>              <NA>
## 3                  <NA>              <NA>            <NA>              <NA>
## 4                  <NA>              <NA>            <NA>              <NA>
## 5                  <NA>              <NA>            <NA>              <NA>
## 6                  <NA>              <NA>            <NA>              <NA>
##   RH_PedestalsTerracettes RH_PlantCommunityComp RH_ReprodCapabilityPeren
## 1                    <NA>                  <NA>                     <NA>
## 2                    <NA>                  <NA>                     <NA>
## 3                    <NA>                  <NA>                     <NA>
## 4                    <NA>                  <NA>                     <NA>
## 5                    <NA>                  <NA>                     <NA>
## 6                    <NA>                  <NA>                     <NA>
##   RH_Rills RH_SoilSiteStability RH_SoilSurfLossDeg RH_SoilSurfResisErosion
## 1     <NA>                 <NA>               <NA>                    <NA>
## 2     <NA>                 <NA>               <NA>                    <NA>
## 3     <NA>                 <NA>               <NA>                    <NA>
## 4     <NA>                 <NA>               <NA>                    <NA>
## 5     <NA>                 <NA>               <NA>                    <NA>
## 6     <NA>                 <NA>               <NA>                    <NA>
##   RH_WaterFlowPatterns RH_WindScouredAreas SoilStability_All
## 1                 <NA>                <NA>                NA
## 2                 <NA>                <NA>                NA
## 3                 <NA>                <NA>                NA
## 4                 <NA>                <NA>                NA
## 5                 <NA>                <NA>                NA
## 6                 <NA>                <NA>                NA
##   SoilStability_Protected SoilStability_Unprotected           DateLoadedInDb
## 1                      NA                        NA 2023-04-10T00:00:00.000Z
## 2                      NA                        NA 2023-04-10T00:00:00.000Z
## 3                      NA                        NA 2023-04-10T00:00:00.000Z
## 4                      NA                        NA 2023-04-10T00:00:00.000Z
## 5                      NA                        NA 2023-04-10T00:00:00.000Z
## 6                      NA                        NA 2023-04-10T00:00:00.000Z
##                                         mlra_name mlrarsym
## 1 Southern Desertic Basins, Plains, and Mountains       42
## 2 Southern Desertic Basins, Plains, and Mountains       42
## 3 Southern Desertic Basins, Plains, and Mountains       42
## 4 Southern Desertic Basins, Plains, and Mountains       42
## 5 Southern Desertic Basins, Plains, and Mountains       42
## 6 Southern Desertic Basins, Plains, and Mountains       42
##                na_l1name    na_l2name          us_l3name
## 1 NORTH AMERICAN DESERTS WARM DESERTS Chihuahuan Deserts
## 2 NORTH AMERICAN DESERTS WARM DESERTS Chihuahuan Deserts
## 3 NORTH AMERICAN DESERTS WARM DESERTS Chihuahuan Deserts
## 4 NORTH AMERICAN DESERTS WARM DESERTS Chihuahuan Deserts
## 5 NORTH AMERICAN DESERTS WARM DESERTS Chihuahuan Deserts
## 6 NORTH AMERICAN DESERTS WARM DESERTS Chihuahuan Deserts
##                      us_l4name State   modis_landcover horizontal_flux_total_MD
## 1 Chihuahuan Basins and Playas    NM Closed Shrublands                       NA
## 2 Chihuahuan Basins and Playas    NM Closed Shrublands                       NA
## 3 Chihuahuan Basins and Playas    NM Closed Shrublands                       NA
## 4    Low Mountains and Bajadas    NM   Open Shrublands                       NA
## 5 Chihuahuan Montane Woodlands    NM        Grasslands                       NA
## 6 Chihuahuan Montane Woodlands    NM        Grasslands                       NA
##   vertical_flux_MD PM2_5_MD PM10_MD
## 1               NA       NA      NA
## 2               NA       NA      NA
## 3               NA       NA      NA
## 4               NA       NA      NA
## 5               NA       NA      NA
## 6               NA       NA      NA

Let's check in on the study area and make a map of our regions of interest.

First, let's map the boundary of the Chihuahuan Desert within the US. To do this, we need to pull in US state boundaries.

# pull in US state boundaries
usa <- st_as_sf(maps::map("state", fill=TRUE, plot =FALSE))

# Generate a map from just the state outlines
state_base_map <- ggplot(usa) +
  geom_sf(color = "#2b2b2b", fill = "white", size=0.125)
state_base_map

Good start. Let's add the Chihuahuan Desert study area to this map. Remember that ggplot figures work in layers, so we can add the desert on top of our base map.

# let's add in the boundary for the Chihuahuan Desert
# add this on to our prior map
state_map <- state_base_map +
  geom_sf(data = chihuahuan_desert, aes(fill = "Chihuahuan Desert"))+ 
  theme_minimal()
state_map

Nice. Now let's use some customization within ggplot to adjust the legend position and change colors.

us_state_map <- state_base_map +
  geom_sf(data = chihuahuan_desert, aes(fill = "Chihuahuan Desert")) +
  scale_fill_manual(values = "goldenrod1", guide = guide_legend(title = "Legend")) + 
  theme(legend.position = "top")
us_state_map

Awesome, now we have a starting map. Let's move on to working with the data.

Take a look at the LDC data and clean it up

# Make the DateVisited formatted as date
data <- data |> dplyr::mutate(DateVisited = lubridate::as_date(DateVisited),
                                    Year = lubridate::year(DateVisited) |> as.factor())
# Make sure the Lat/Long coordinates are numeric
data <- data |> 
  dplyr::mutate(Longitude_NAD83 = as.numeric(Longitude_NAD83),
                Latitude_NAD83 = as.numeric(Latitude_NAD83)) 
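
It can be worth confirming that these conversions behaved as expected before moving on. A minimal sanity check (the column names are taken from the LDC indicators table printed above):

```r
# peek at the converted columns: DateVisited should be Date, Year a factor,
# and the coordinates numeric
str(data[, c("DateVisited", "Year", "Latitude_NAD83", "Longitude_NAD83")])

# the range of visit dates in the dataset
range(data$DateVisited, na.rm = TRUE)
```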

2. Establish benchmark

We want to understand which sites might be susceptible to wind erosion. To do this, we took a subset (~600) of the points in the ecoregion, compared them with wind erosion models, and determined that decreases in total foliar cover below 30% are associated with an increase in wind erosion risk. However, we do not want to apply the benchmark to the points used to develop it, so we will first remove the benchmark-development points and then apply the 30% benchmark to the remaining points.

Pull in the benchmark development data.

benchmark <- read.csv("benchmark_points.csv")
head(benchmark)
##                   PrimaryKey SpeciesState      PlotID          PlotKey
## 1  1611141556434902016-09-01           NM  Bottom-002      1.61114E+14
## 2 16111616394616792016-09-01           NM NR-FIRE-TRT 1611161639461679
## 3 16111616402732582016-09-01           NM NR-FIRE-CTL 1611161640273258
## 4 16102714453525922016-09-01           NM   Sandy-463 1610271445352592
## 5 16081110402163562016-09-01           NM   Gsand-144 1608111040216356
## 6 16081114554791692016-09-01           NM   Hills-188 1608111455479169
##                                 DBKey EcologicalSiteId State   County
## 1 NM_LasCrucesDO_LUP_2016_5-3b_01.mdb      R042XB012NM    NM Dona Ana
## 2 NM_LasCrucesDO_LUP_2016_5-3b_01.mdb      R042XB035NM    NM Dona Ana
## 3 NM_LasCrucesDO_LUP_2016_5-3b_01.mdb      R042XB035NM    NM Dona Ana
## 4 NM_LasCrucesDO_LUP_2016_5-3b_01.mdb      R042XB011NM    NM     Luna
## 5 NM_LasCrucesDO_LUP_2016_5-3b_01.mdb      R042XB024NM    NM Dona Ana
## 6 NM_LasCrucesDO_LUP_2016_5-3b_01.mdb      R042XB010NM    NM Dona Ana
##    DateEstablished DateLoadedInDb                   ProjectName DateVisited
## 1  11/6/2016 17:00       9/1/2016 New Mexico Las Cruces FO 2016          NA
## 2  11/8/2016 17:00       9/1/2016 New Mexico Las Cruces FO 2016          NA
## 3  11/8/2016 17:00       9/1/2016 New Mexico Las Cruces FO 2016          NA
## 4 10/24/2016 18:00       9/1/2016 New Mexico Las Cruces FO 2016          NA
## 5  8/10/2016 18:00       9/1/2016 New Mexico Las Cruces FO 2016          NA
## 6  8/10/2016 18:00       9/1/2016 New Mexico Las Cruces FO 2016          NA
##     source LocationType ELEVATION PercentCoveredByEcoSite MLRA_ID MLRARSYM
## 1 TerrADat         <NA>        NA                      NA      43      42B
## 2 TerrADat         <NA>        NA                      NA      43      42B
## 3 TerrADat         <NA>        NA                      NA      43      42B
## 4 TerrADat         <NA>        NA                      NA      43      42B
## 5 TerrADat         <NA>        NA                      NA      43      42B
## 6 TerrADat         <NA>        NA                      NA      43      42B
##                  MLRA_NAME LRRSYM                           LRR_NAME
## 1 Southern Rio Grande Rift      D Western Range and Irrigated Region
## 2 Southern Rio Grande Rift      D Western Range and Irrigated Region
## 3 Southern Rio Grande Rift      D Western Range and Irrigated Region
## 4 Southern Rio Grande Rift      D Western Range and Irrigated Region
## 5 Southern Rio Grande Rift      D Western Range and Irrigated Region
## 6 Southern Rio Grande Rift      D Western Range and Irrigated Region

3. Apply benchmark

Start by removing the benchmark training data.

# remove data that have been used in benchmarking
nrow(data)
## [1] 2294
data <- data |> subset(!PrimaryKey %in% benchmark$PrimaryKey)

# check total number of plots
nrow(data)
## [1] 1709

Score the plots based on whether they are meeting or not meeting the benchmark.

# Apply the benchmark of 30% to the Total Foliar Cover values
data <- data |> 
  dplyr::mutate(evaluated = dplyr::case_when(
    TotalFoliarCover >30 ~ "Meeting",
    TotalFoliarCover <=30 ~ "Not meeting")) |>
  dplyr::filter(!is.na(TotalFoliarCover))
# this will create a new column called evaluated with values of meeting or not meeting based on the values in the Total Foliar Cover col
# there were some plots with NA in the TotalFoliarCover - we want to remove those since we cannot evaluate them for this benchmark

Convert these data to spatial objects so we can make maps and conduct spatial analyses. To accomplish this, we will need to specify a CRS (coordinate reference system; https://rspatial.org/raster/spatial/6-crs.html).

# move from a data frame to an sf object
# users must specify which columns are coordinates and the coordinate reference system
data_sf <- sf::st_as_sf(data %>% subset(!is.na(Longitude_NAD83)),
                     coords = c("Longitude_NAD83", "Latitude_NAD83"),
                     crs = st_crs(4269)) # 4269 is the EPSG code for NAD83


# save these data as a shapefile
sf::st_write(data_sf, "ldc_data.shp", append=F) 
## Warning in abbreviate_shapefile_names(obj): Field names abbreviated for ESRI
## Shapefile driver
## Deleting layer `ldc_data' using driver `ESRI Shapefile'
## Writing layer `ldc_data' to data source `ldc_data.shp' using driver `ESRI Shapefile'
## Writing 1502 features with 85 fields and geometry type Point.
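
The warning above appears because the ESRI Shapefile format truncates field names to 10 characters. If you want to keep the full column names, one option (a sketch, not required for this workshop) is to write a GeoPackage instead:

```r
# GeoPackage (.gpkg) preserves full field names, unlike shapefiles
sf::st_write(data_sf, "ldc_data.gpkg", append = FALSE)
```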

We can add the points to our base map of the states.

# recall study area map
us_state_map

This map is great for providing context for our study area within the US, but to see the plot points on a map we are going to need to zoom in to New Mexico and Arizona. To do that, we can filter the 'usa' data frame.

head(usa) # seems like the state name is in the ID column 
## Simple feature collection with 6 features and 1 field
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -124.3834 ymin: 30.24071 xmax: -71.78015 ymax: 42.04937
## Geodetic CRS:  +proj=longlat +ellps=clrk66 +no_defs +type=crs
##                      ID                           geom
## alabama         alabama MULTIPOLYGON (((-87.46201 3...
## arizona         arizona MULTIPOLYGON (((-114.6374 3...
## arkansas       arkansas MULTIPOLYGON (((-94.05103 3...
## california   california MULTIPOLYGON (((-120.006 42...
## colorado       colorado MULTIPOLYGON (((-102.0552 4...
## connecticut connecticut MULTIPOLYGON (((-73.49902 4...
zoomed_state_map <- ggplot()+
  geom_sf(data=usa %>%  dplyr::filter(ID %in% c("new mexico", "arizona")),
          fill = "white", color = "black")+
  theme_minimal()
zoomed_state_map

Add the Chihuahuan Desert and Sandy ESG map layers to this zoomed-in map.

zoomed_state_map <- zoomed_state_map+ 
  geom_sf(data = chihuahuan_desert, aes(fill = "Chihuahuan Desert"), alpha = 0.75) + # make semi transparent with alpha 
  geom_sf(data = sandy_esg, aes(fill = "Sandy ESG"), color = "gold4" )+ 
  scale_fill_manual(values = c("Chihuahuan Desert" = "goldenrod1", 
                               "Sandy ESG" = "gold4"),
                    name = "",
                    labels = c("Chihuahuan Desert", "Sandy ESG")) + 
  theme(legend.position = "top")
zoomed_state_map

##### challenge!
# add the plot locations to the study area map; consider changing dot size using "size ="

#zoomed_state_map_withPts <- zoomed_state_map + 

Now we can see the plot locations, but where are the plots that are meeting or not meeting the benchmark? We can change the color of each plot location based on the 'evaluated' column.

# get the values in this column 
summary(as.factor(data_sf$evaluated))
##     Meeting Not meeting 
##         948         554
# Specify colors for meeting and not meeting
meeting_colors <- c("Meeting" = "royalblue1", "Not meeting" = "firebrick4")

# add these points to the zoomed state map, but have the color of each point indicate if the location is meeting or not meeting the benchmark
study_with_bench_eval <- zoomed_state_map +
  geom_sf(data = data_sf,
          size = 0.75, 
          aes(color = evaluated)) +
  scale_color_manual(values = meeting_colors, 
                     name = "Wind Erosion Benchmark")

# Display the map with adjusted point size and color
study_with_bench_eval

Now that we know where these points are, we can make some summary graphs to show the portion of the plots which are meeting this benchmark.

For this, we are interested in understanding how many plots are meeting the benchmark within the Chihuahuan Desert, the Sandy ESG, MLRA 42, and a long-term research site at the Jornada Experimental Range called "NWERN_JER".

# Identify the membership of each plot within different polygons (Chihuahuan Desert, MLRA 42, Sandy ESG, and a long-term research site at the Jornada Experimental Range called "NWERN_JER")
# We know all the points are within the Chihuahuan Desert, so create a column and fill all values with "yes"
# sort plots by whether they occur in MLRA 42: if the value in the mlrarsym column is 42, put "yes" in the new column called MLRA42
data_sf <- data_sf |> 
  dplyr::mutate(Chihuahuan_desert = "yes",
                MLRA42 = dplyr::case_when(mlrarsym %in% "42"~ "yes"),
                NWERN_JER = dplyr::case_when(ProjectKey %in% "NWERN_JER"~ "yes"))
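
A quick way to check how many plots fell into each group is to tabulate the new columns. Note that case_when() leaves non-matching rows as NA, so we count those too:

```r
# tabulate membership; NAs are plots outside the group
table(data_sf$MLRA42, useNA = "ifany")
table(data_sf$NWERN_JER, useNA = "ifany")
```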

We also want to know if each point falls within the Sandy ESG. We will use the st_within function.

# create a col called SandyESG, where values correspond to if each row is within the SandyESG
# Assuming data_sf and sandy_esg are sf objects

data_sf$SandyESG <- sf::st_within(data_sf, sandy_esg)

Hmm, seems like there is an issue with mismatched CRSs.

# figure out the current CRS of each dataset
print(st_crs(data_sf))
## Coordinate Reference System:
##   User input: EPSG:4269 
##   wkt:
## GEOGCRS["NAD83",
##     DATUM["North American Datum 1983",
##         ELLIPSOID["GRS 1980",6378137,298.257222101,
##             LENGTHUNIT["metre",1]]],
##     PRIMEM["Greenwich",0,
##         ANGLEUNIT["degree",0.0174532925199433]],
##     CS[ellipsoidal,2],
##         AXIS["geodetic latitude (Lat)",north,
##             ORDER[1],
##             ANGLEUNIT["degree",0.0174532925199433]],
##         AXIS["geodetic longitude (Lon)",east,
##             ORDER[2],
##             ANGLEUNIT["degree",0.0174532925199433]],
##     USAGE[
##         SCOPE["Geodesy."],
##         AREA["North America - onshore and offshore: Canada - Alberta; British Columbia; Manitoba; New Brunswick; Newfoundland and Labrador; Northwest Territories; Nova Scotia; Nunavut; Ontario; Prince Edward Island; Quebec; Saskatchewan; Yukon. Puerto Rico. United States (USA) - Alabama; Alaska; Arizona; Arkansas; California; Colorado; Connecticut; Delaware; Florida; Georgia; Hawaii; Idaho; Illinois; Indiana; Iowa; Kansas; Kentucky; Louisiana; Maine; Maryland; Massachusetts; Michigan; Minnesota; Mississippi; Missouri; Montana; Nebraska; Nevada; New Hampshire; New Jersey; New Mexico; New York; North Carolina; North Dakota; Ohio; Oklahoma; Oregon; Pennsylvania; Rhode Island; South Carolina; South Dakota; Tennessee; Texas; Utah; Vermont; Virginia; Washington; West Virginia; Wisconsin; Wyoming. US Virgin Islands. British Virgin Islands."],
##         BBOX[14.92,167.65,86.45,-40.73]],
##     ID["EPSG",4269]]
print(st_crs(sandy_esg))
## Coordinate Reference System:
##   User input: WGS 84 / Pseudo-Mercator 
##   wkt:
## PROJCRS["WGS 84 / Pseudo-Mercator",
##     BASEGEOGCRS["WGS 84",
##         ENSEMBLE["World Geodetic System 1984 ensemble",
##             MEMBER["World Geodetic System 1984 (Transit)"],
##             MEMBER["World Geodetic System 1984 (G730)"],
##             MEMBER["World Geodetic System 1984 (G873)"],
##             MEMBER["World Geodetic System 1984 (G1150)"],
##             MEMBER["World Geodetic System 1984 (G1674)"],
##             MEMBER["World Geodetic System 1984 (G1762)"],
##             MEMBER["World Geodetic System 1984 (G2139)"],
##             ELLIPSOID["WGS 84",6378137,298.257223563,
##                 LENGTHUNIT["metre",1]],
##             ENSEMBLEACCURACY[2.0]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["degree",0.0174532925199433]],
##         ID["EPSG",4326]],
##     CONVERSION["Popular Visualisation Pseudo-Mercator",
##         METHOD["Popular Visualisation Pseudo Mercator",
##             ID["EPSG",1024]],
##         PARAMETER["Latitude of natural origin",0,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8801]],
##         PARAMETER["Longitude of natural origin",0,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8802]],
##         PARAMETER["False easting",0,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8806]],
##         PARAMETER["False northing",0,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8807]]],
##     CS[Cartesian,2],
##         AXIS["easting (X)",east,
##             ORDER[1],
##             LENGTHUNIT["metre",1]],
##         AXIS["northing (Y)",north,
##             ORDER[2],
##             LENGTHUNIT["metre",1]],
##     USAGE[
##         SCOPE["Web mapping and visualisation."],
##         AREA["World between 85.06°S and 85.06°N."],
##         BBOX[-85.06,-180,85.06,180]],
##     ID["EPSG",3857]]

Ah yes, the CRS for sandy_esg is WGS 84 / Pseudo-Mercator. Let's transform it to match.

# Transform sandy_esg to have the same CRS as data_sf
sandy_esg_transformed <-  sf::st_make_valid(sf::st_transform(sandy_esg, st_crs(data_sf)))

# try again to assign values based on points, but use the transformed data
data_sf$SandyESG <- sf::st_within(data_sf, sandy_esg_transformed)

head(summary(data_sf$SandyESG), 20) # st_within returns a list: each element has length 1 (inside the polygon) or 0 (outside)
##       Length Class  Mode   
##  [1,] 0      -none- numeric
##  [2,] 0      -none- numeric
##  [3,] 0      -none- numeric
##  [4,] 0      -none- numeric
##  [5,] 0      -none- numeric
##  [6,] 0      -none- numeric
##  [7,] 0      -none- numeric
##  [8,] 0      -none- numeric
##  [9,] 0      -none- numeric
## [10,] 0      -none- numeric
## [11,] 0      -none- numeric
## [12,] 0      -none- numeric
## [13,] 0      -none- numeric
## [14,] 0      -none- numeric
## [15,] 0      -none- numeric
## [16,] 0      -none- numeric
## [17,] 0      -none- numeric
## [18,] 1      -none- numeric
## [19,] 0      -none- numeric
## [20,] 0      -none- numeric
# Convert the matches to "yes"/"no": a match of length > 0 means the point is within the Sandy ESG
data_sf$SandyESG <- ifelse(lengths(data_sf$SandyESG) > 0, "yes", "no")
head(summary(data_sf$SandyESG), 20)
##    Length     Class      Mode 
##      1502 character character

Great. Now for each plot (row), we know whether the plot occurs within the Chihuahuan Desert, the Sandy ESG, MLRA 42, and NWERN_JER.

To prepare for plotting, we need to transform the data from wide to long format using a pivot.

# To facilitate plotting, it would be helpful if the data were long rather than wide.
data_long <- as.data.frame(data_sf) |>
  tidyr::pivot_longer(cols = c("MLRA42",
                               "SandyESG", 
                               "NWERN_JER", 
                               "Chihuahuan_desert"),
                      names_to = "region",
                      values_to = "region_membership") |>
  subset(region_membership =="yes")|>
  dplyr::mutate(region = factor(region)|>
                  dplyr::recode( "NWERN_JER" = "Site",
                                 "SandyESG" = "ESG", 
                                 "MLRA42" = "MLRA",
                                 "Chihuahuan_desert" = "Ecoregion"))

# We also need region to be a factor for easier plot formatting
data_long$region <- factor(data_long$region,
                              levels = c( "Site", 
                                          "ESG",
                                          "MLRA",
                                          "Ecoregion")
                              )

Plot the results and complete a challenge.

We subset the data to 2015-2022 because that is when we have a sufficient sample size.

# box plot
benchmark_boxplot <- 
  ggplot(data_long |> 
           subset(Year %in% c("2015", "2016", "2017", "2018", "2019", "2020", "2021", "2022") &
                                !is.na(TotalFoliarCover)),
       aes(x = Year,
           y = TotalFoliarCover,
           group = Year)) +
  facet_grid(rows = "region")+
  geom_boxplot()+
  geom_point(aes(color = evaluated))+
  scale_color_manual(values = c("darkblue", "orange"))+
  theme_bw()+
  ylab("Total foliar cover (%)")+
  guides(color=guide_legend(title="Benchmark status"))
benchmark_boxplot

CHALLENGE

This looks good, but the points are covering up the boxes. To make the plot easier to read, make the points semi-transparent and jitter them horizontally.

#benchmark_boxplot_jittered <- 
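One possible solution, sketched here with a hypothetical toy data frame (the `evaluated` labels are made up for illustration) so the snippet runs on its own. In the workshop session, swap `toy` for `data_long` and keep the facet layer from the boxplot above.

```r
library(ggplot2)

# Toy stand-in for data_long (hypothetical values, for illustration only)
set.seed(1)
toy <- data.frame(
  Year = rep(c("2015", "2016"), each = 20),
  TotalFoliarCover = runif(40, 5, 60),
  evaluated = sample(c("meeting", "not meeting"), 40, replace = TRUE)
)

benchmark_boxplot_jittered <-
  ggplot(toy, aes(x = Year, y = TotalFoliarCover, group = Year)) +
  geom_boxplot(outlier.shape = NA) + # hide outliers; the jittered points already show them
  geom_jitter(aes(color = evaluated),
              width = 0.2,   # horizontal wiggle
              alpha = 0.5) + # semi-transparency
  scale_color_manual(values = c("darkblue", "orange")) +
  theme_bw() +
  ylab("Total foliar cover (%)") +
  guides(color = guide_legend(title = "Benchmark status"))
```

Setting `outlier.shape = NA` in `geom_boxplot()` avoids double-plotting the outliers, since every point is already drawn by `geom_jitter()`.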

Bar chart

benchmark_barchart <- 
  ggplot(data_long |> 
           subset(Year %in% c("2015", "2016", "2017", "2018", "2019", "2020", "2021", "2022") &
                                !is.na(TotalFoliarCover)), # keep plots with a non-missing cover value
       aes(x = Year,
           y = TotalFoliarCover,
           fill = evaluated)) +
  facet_grid(rows = "region")+
  geom_bar(position = "fill", stat = "identity")+
  scale_fill_manual(values = c("darkblue", "orange"))+
  theme_bw()+
  ylab("Proportion of plots")+
  guides(fill=guide_legend(title="Benchmark status"))
benchmark_barchart

Summarize these data in a table

data_long_summary <- 
  data_long |> 
  subset(Year %in% c("2015", "2016", "2017", "2018", "2019", "2020", "2021", "2022") &
           !is.na(TotalFoliarCover)) |> # keep plots with a non-missing cover value
  dplyr::select(PrimaryKey, region, Year) |>
  dplyr::mutate(Year = as.character(Year)) |>
  dplyr::distinct()|>
  dplyr::group_by(Year, region) |> 
  dplyr::tally()

4. Climate Data/RAP

Authenticate to API and run preliminary test

Define Climate Engine API key and run a simple test endpoint to make sure that the key authenticated correctly. The Climate Engine API documentation is published at https://docs.climateengine.org

There are also Climate Engine API tutorials in R and Python here: https://support.climateengine.org/article/42-api-tutorials

The Climate Engine API has a ‘Swagger’ app for testing and making requests at: https://api.climateengine.org/docs

NOTE: The key below will remain active through February 4, please request a free key to run this in the future: https://docs.climateengine.org/docs/build/html/registration.html

# Define root url for Climate Engine API
root_url <- 'https://api.climateengine.org/'

# Define key
key <- 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJmcmVzaCI6ZmFsc2UsImlhdCI6MTcwNTg3ODY2NiwianRpIjoiYmEyYTZmMzMtNjdkOS00ODAzLThiOTQtYWQ0Yjg5NmZjZDJkIiwibmJmIjoxNzA1ODc4NjY2LCJ0eXBlIjoiYWNjZXNzIiwic3ViIjoidUZXM0VmMUxhWmI5U0VhUnRPdzVNUzI2UjBTMiIsImV4cCI6MTcwNzE3NDY2Niwicm9sZXMiOiJ1c2VyIiwidXNlcl9pZCI6InVGVzNFZjFMYVpiOVNFYVJ0T3c1TVMyNlIwUzIifQ.WIQZIn97BpFZorL_n6NGvEi_MHg2qMnIUh9Rm6aOcl0'

# Define endpoint
endpoint <- '/home/validate_key'

# Run simple endpoint to get key expiration
test <- request(base_url = paste0(root_url, endpoint)) |>
  req_headers(Authorization = key) |>
  req_perform()

print(resp_raw(test))
## HTTP/1.1 200 OK
## content-type: application/json
## X-Cloud-Trace-Context: 56fbd20bd5568e2528627f452c2d28bc;o=1
## Date: Wed, 31 Jan 2024 17:32:34 GMT
## Server: Google Frontend
## Content-Length: 35
## Via: 1.1 google
## Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
## 
## {"ClimateEngine":"Your key works!"}
## <httr2_response>
## GET https://api.climateengine.org//home/validate_key
## Status: 200 OK
## Content-Type: application/json
## Body: In memory (35 bytes)
# Clean up unneeded objects
rm(test, endpoint)

Import the wind erosion area of interest

This section of the notebook runs using the Sandy Ecological Site Group, but you can run a similar analysis for any shapefile. We will import the shapefile as an sf object, upload it to Google Earth Engine, and then visualize the Sandy ESG on a leaflet map using the mapview package.

# Print the number of vertices in shapefile
# Because of the complexity of the shapefile, for this demo I uploaded it to Google Earth Engine
# For less complex shapefiles there are ways to pull the coordinates directly from the shapefile without Earth Engine
# Here is an example notebook: https://github.com/Google-Drought/SupportSiteTutorials/blob/a038ee1e69fff32008d8619c0acc6db082d5795d/Timeseries/OpenET_Example.Rmd
print(mapview::npts(sandy_esg))
## [1] 383672
# Visualize the allotments on a leaflet map using mapview package
mapview(sandy_esg)
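If you do want to pull coordinates directly from a complex shapefile rather than uploading it to Earth Engine, one option is to thin the vertices first with `sf::st_simplify()`. A minimal sketch using the North Carolina demo shapefile bundled with sf (the `dTolerance` value and EPSG code are illustrative choices, not part of the workshop):

```r
library(sf)

# Read the demo shapefile that ships with the sf package
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Helper: count the vertices in an sf object
count_vertices <- function(x) {
  sum(sapply(st_geometry(x), function(g) nrow(st_coordinates(g))))
}
before <- count_vertices(nc)

# Project first so dTolerance is in meters, then simplify
nc_simple <- st_simplify(st_transform(nc, 32617), dTolerance = 1000)
after <- count_vertices(nc_simple)

c(before = before, after = after) # far fewer vertices after simplification
```

A smaller vertex count makes it feasible to send the geometry itself in an API request, as in the OpenET example notebook linked above.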

Make API requests to get Rangeland Analysis Platform (RAP) fractional vegetation cover data for annual forbs and grasses (AFG), perennial forbs and grasses (PFG), shrubs (SHR), trees (TRE), and bare ground (BGR)

Fetch RAP data

Here we will iterate over a list of vegetation types to get an annual time series of vegetation cover for our area of interest. There are a few steps included here to clean up the data returned from the API: requests are sent to the API as JSON and the response comes back as JSON, so several lines of code convert the JSON data into an R dataframe.

Resources: - API Documentation for the endpoint used in this section is found here: https://docs.climateengine.org/docs/build/html/timeseries.html#rst-timeseries-native-custom-asset - API Documentation for the Rangeland Analysis Platform dataset is found here: https://docs.climateengine.org/docs/build/html/variables.html#rst-rap-vegetation-cover - Documentation on the Rangeland Analysis Platform is available through https://rangelands.app

Let’s start by making a request for a single RAP cover type, shrub cover (SHR).

# Create list of API parameters for request
params <- list(
  dataset = 'RAP_COVER',
  variable = 'SHR',
  start_date = '1986-01-01',
  end_date = '2023-12-31',
  area_reducer = 'mean',
  asset_id = 'users/EricRJensen-DRI/RinRangelands/SandyESG')
  
# Make API request and get data from response
data <- request(base_url = paste0(root_url, 'timeseries/native/custom_asset')) |> # define URL to make request to
    req_url_query(query = !!!params) |> # define parameters for request
    req_headers(Authorization = key) |> # set authentication key as a header for request
    req_perform() |> # perform request
    resp_body_json() 
data[1]
## [[1]]
## [[1]]$Metadata
## [[1]]$Metadata$DRI_OBJECTID
## [1] "00000000000000000000"
## 
## [[1]]$Metadata$`Statistic over region`
## [1] "mean"
## 
## 
## [[1]]$Data
## [[1]]$Data[[1]]
## [[1]]$Data[[1]]$Date
## [1] "1986-01-01"
## 
## [[1]]$Data[[1]]$`SHR (%)`
## [1] 13.4182
## 
## 
## [[1]]$Data[[2]]
## [[1]]$Data[[2]]$Date
## [1] "1987-01-01"
## 
## [[1]]$Data[[2]]$`SHR (%)`
## [1] 15.235
## 
## 
## [[1]]$Data[[3]]
## [[1]]$Data[[3]]$Date
## [1] "1988-01-01"
## 
## [[1]]$Data[[3]]$`SHR (%)`
## [1] 14.3855
## 
## ... (output truncated for readability: the list continues with one Date / `SHR (%)` pair per year through 2023-01-01)

Now that we have a handle on how to make a single request, let’s loop over the five RAP Cover variables to get all of the data.

# Make a list of RAP variable options — we will loop over these using the function below to return all of the data
variables <- c('AFG', 'BGR', 'PFG', 'SHR', 'TRE') 
 
# Loop over the RAP Cover variables to get all RAP Cover variables from the API
rap_df <- tibble()
for (variable in variables){ 
  
  print(paste0("Requesting data for ", variable))
  
  # Create list of API parameters for request
  params <- list(
    dataset = 'RAP_COVER',
    variable = variable,
    start_date = '1986-01-01',
    end_date = '2023-12-31',
    area_reducer = 'mean',
    asset_id = 'users/EricRJensen-DRI/RinRangelands/SandyESG')
  
  # Make API request and get data from response
  data <- request(base_url = paste0(root_url, 'timeseries/native/custom_asset')) |> # define URL to make request to
    req_url_query(query = !!!params) |> # define parameters for request
    req_headers(Authorization = key) |> # set authentication key as a header for request
    req_perform() |> # perform request
    resp_body_json() |> # parse result into JSON
    pluck(1, 'Data') # Get RAP data from the JSON
  
  # Convert JSON to dataframe
  # For each date in data response, convert to one row dataframe and then bind them together
  # Here I'm using the Tidyverse version of iterating, the purrr package's map function. You can use this function similarly to for loops for cleaner code.
  generate_data_frame <- function(item){
    
    return(tibble(Date = as.Date(item$Date),
                  Variable = as.factor(names(item[2]) |> str_sub(1, 3)), # e.g. "SHR (%)" -> "SHR"
                  Value = item[[2]])) 
    }
  df <- purrr::map(data, generate_data_frame) |> 
    bind_rows()
  
  # Combine the data returned from the API for all of the RAP Cover variables
  rap_df <- bind_rows(rap_df, df)
}
## [1] "Requesting data for AFG"
## [1] "Requesting data for BGR"
## [1] "Requesting data for PFG"
## [1] "Requesting data for SHR"
## [1] "Requesting data for TRE"
print(rap_df)
## # A tibble: 190 × 3
##    Date       Variable Value
##    <date>     <fct>    <dbl>
##  1 1986-01-01 AFG      4.44 
##  2 1987-01-01 AFG      2.66 
##  3 1988-01-01 AFG      1.51 
##  4 1989-01-01 AFG      1.23 
##  5 1990-01-01 AFG      1.52 
##  6 1991-01-01 AFG      2.46 
##  7 1992-01-01 AFG      3.44 
##  8 1993-01-01 AFG      1.65 
##  9 1994-01-01 AFG      0.645
## 10 1995-01-01 AFG      1.63 
## # ℹ 180 more rows

Visualize the RAP data as a time series for each vegetation type

Make a simple plot of the RAP time-series data for the Sandy ESG. The plot produced is similar to the one produced in the RAP web application.

Note: We could also visualize these data as a stacked chart, similar to the drought data below, but here we plot them as lines to show a different view.
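For reference, a stacked version takes only a small change to the aesthetics. A minimal sketch with a hypothetical two-class mini data frame (with the workshop data you would pass `rap_df` and map `fill = Variable`):

```r
library(ggplot2)

# Hypothetical mini data frame in the same shape as rap_df (values for illustration only)
toy_rap <- data.frame(
  Date = rep(as.Date(c("2020-01-01", "2021-01-01", "2022-01-01")), each = 2),
  Variable = rep(c("SHR", "PFG"), times = 3),
  Value = c(14, 20, 15, 18, 13, 22)
)

# Stacked areas instead of lines: fill (not colour) drives the stacking
rap_stacked <- ggplot(toy_rap, aes(x = Date, y = Value, fill = Variable)) +
  geom_area(position = "stack") +
  labs(y = "Fractional cover (%)") +
  theme_classic()
```

`geom_area(position = "stack")` sums the cover classes vertically at each date, which is how the RAP web application displays its cover chart.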

# Make a simple plot of the time-series data
rap_cover <- ggplot(data = rap_df)+
  geom_line(mapping = aes(x = Date, y = Value, colour = Variable), linewidth = 2)+
  labs(title = 'Rangeland Analysis Platform (RAP) Vegetation Cover for Sandy ESG', y = 'Fractional cover (%)') +
  scale_colour_manual(values = c("#BF604A", "#BF8C4A", "#A5C737", "#3757AD", "#0A6E1C"), 
                      labels = c("AFG", "BGR", "PFG", "SHR", "TRE"))+
  theme_classic()+
  scale_y_continuous(expand = c(0,0))+
  theme(legend.position="bottom",
        axis.title.x = element_blank(),
        legend.title=element_blank())
rap_cover

# add a line for 2015, since the rest of our benchmark comparison focused on 2015-2022 
rap_cover <- rap_cover + 
  geom_vline(xintercept = as.Date("2015-01-01"), linetype = "dashed", 
             color = "black", linewidth = 1) # linewidth replaces the deprecated size aesthetic
rap_cover

Challenge

For each year in the ROI, calculate the total RAP cover of vegetation (AFG + PFG + SHR + TRE). Then apply our wind erosion benchmark: determine the number of years in which total vegetation cover is 30% or higher (meeting the benchmark) or lower (not meeting it).

  • Hint: you might need to use the filter, group_by, and summarize/mutate functions to calculate vegetation cover.

Answer: Between 1986 and 2022, annual average cover in the Sandy ESG was at least 30% total vegetation cover, and therefore meeting our wind erosion benchmark, in 20 years; 17 years did not meet the benchmark.
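One way to arrive at that answer, sketched with a hypothetical two-year stand-in so the snippet runs on its own. With the workshop data, pipe `rap_df` itself through the same steps (and if your `Variable` values carry a trailing space from the API column names, apply `trimws()` first):

```r
library(dplyr)

# Hypothetical stand-in for rap_df (values for illustration only)
toy_rap <- data.frame(
  Date     = rep(as.Date(c("1986-01-01", "1987-01-01")), each = 5),
  Variable = rep(c("AFG", "BGR", "PFG", "SHR", "TRE"), times = 2),
  Value    = c(4.4, 50.0, 20.1, 13.4, 0.5,
               2.7, 55.0, 15.0, 15.2, 0.4)
)

veg_by_year <- toy_rap |>
  filter(Variable != "BGR") |>        # bare ground is not vegetation
  mutate(Year = format(Date, "%Y")) |>
  group_by(Year) |>
  summarize(TotalVeg = sum(Value)) |>
  mutate(meeting = TotalVeg >= 30)    # 30% total vegetation cover benchmark

table(veg_by_year$meeting)            # count of years meeting vs. not meeting
```

With the toy values, both years total above 30% cover and meet the benchmark.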

Make API request to get Standardized Precipitation-Evapotranspiration Index (SPEI) data from gridMET Drought.

Fetch 180-day SPEI data from gridMET Drought.

Here, we will get histograms of 180-day SPEI drought data based on accepted drought categories developed by NOAA NIDIS (drought.gov). This will allow us to make a beautiful stacked bar chart to depict the drought history for our AOI. You can learn more about drought indicators and how the Climate Engine team is helping the BLM to assess drought on our support site: https://support.climateengine.org/article/128-rangeland-drought-assessment

In this section we’ll be using the ‘zonal_stats/pixel_count/custom_asset’ API endpoint. Documentation for this endpoint is found here: https://docs.climateengine.org/docs/build/html/zonal_statistics.html#rst-zonal-stats-pixel-count-custom-asset

NOTE: I’ve pre-run the fetching of the histogram data since it can be time-consuming to run live. Skip ahead to the next section to read in CSVs of the drought data and produce the stacked bar chart.
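For context, the `bins` argument in the commented request below corresponds to the SPEI thresholds separating the NIDIS drought (and wetness) categories. A small sketch of the drought-side mapping, using the same bin edges as the request and the category labels used in the plot later on:

```r
# Map 180-day SPEI values onto drought categories
# (bin edges follow the API request below; labels follow the NDMC drought classification)
spei_to_category <- function(spei) {
  cut(spei,
      breaks = c(-Inf, -2, -1.5, -1.2, -0.7, -0.5, Inf),
      labels = c("D4 Exceptional", "D3 Extreme", "D2 Severe",
                 "D1 Moderate", "D0 Abnormally dry", "No drought"))
}

spei_to_category(c(-2.3, -1.0, 0.4))
```

The API does this binning server-side and returns pixel counts per bin; this sketch just makes the bin-to-category correspondence explicit.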

# # Generate a date list to extract drought statistics over
# start_year = 2015
# end_year = 2023
# yearlist = seq(from = start_year, to = end_year, by = 1)
# 
# # Loop over the yearlist to generate a list of dates for each year from 2015-2023 
# dates = list()
# for (year in yearlist){
#   dates_year <- seq(from = as.Date(paste0(as.character(year), "/1/5")), to = as.Date(paste0(as.character(year), "/12/31")), by = '5 days')
#   dates <- c(dates, as.character(dates_year))
#   }
# 
# # Documentation on the gridMET Drought is available here: https://support.climateengine.org/article/45-gridmet-drought
# # API Documentation on the gridMET Drought: https://docs.climateengine.org/docs/build/html/variables.html#rst-gridmet-drought
# # Generate dataframe of histograms
# get_drought_histogram <- function(date){
# 
#   print(paste0('Running ', date))
# 
#   # Create list of API parameters for request
#   params = list(
#     dataset = 'GRIDMET_DROUGHT',
#     variable = 'spei180d',
#     end_date = as.character(date),
#     area_reducer = 'mean',
#     asset_id = 'users/EricRJensen-DRI/RinRangelands/SandyESG',
#     bins = '[-2, -1.5, -1.2,-0.7, -0.5, 0.5, 0.7, 1.2, 1.5, 2]')
# 
#   # Make API request and get data from response
#   data <- request(base_url = paste0(root_url, 'zonal_stats/pixel_count/custom_asset')) |>
#     req_url_query(query = !!!params) |>
#     req_headers(Authorization = key) |>
#     req_perform() |>
#     resp_body_json() |>
#     pluck(1)
# 
#   # Iterate over the SPEI histogram bins to produce dataframe of SPEI histograms
#   bins = names(data$spei180d)
#   row_df <- tibble('Date' = date)
#   for (bin in bins){
#     val = data$spei180d[bin][1] |>
#       unlist() |>
#       unname()
#     df = tibble(bin = val)
#     names(df) <- bin
#     row_df = bind_cols(row_df, df)}
# 
#   return(row_df) }
# 
# spei_df <- purrr::map(dates, get_drought_histogram) |> # use dates[1:4] to test on just the first few dates
#   bind_rows() |>
#   mutate(across(everything(), ~replace_na(.x, 0)))

# Optionally write out csv
# write_csv(spei_df, paste0('data/sandy_esg', '_', 'spei180d', '_', as.character(start_year), '_', as.character(end_year), '.csv'))

Create a stacked barchart of the 180-day SPEI drought categories for the AOI

In the code below we will prepare the dataframe of drought histograms that we produced in the last section for visualization. First, we will pivot the dataframe from wide to long; then, we will calculate the percentage of the area of interest that is in each drought category for each date; last, we will order the categories in the correct order to match the figure referenced at the beginning of this case study.

Preparing the 180-day SPEI data

# Read in the 180-day SPEI data CSV
spei_df <- read.csv('data/sandy_esg_spei180d_2015_2023.csv')
# Rename the bin-count columns to category codes: negative = drought categories, positive = wet categories, 0 = neutral
names(spei_df) <- c("Date", "1", "2", "3", "4", "5", "0", "-1", "-2", "-3", "-4", "-5", "null")
# Clean and prepare the data for plotting the stacked bar chart
spei_df_plot <- spei_df |> 
  select(-'null') |> # remove null column resulting from gridMET drought data issue
  mutate(Date = lubridate::as_date(Date)) |> # The date needs to be a date object for the plot
  pivot_longer(!Date, names_to = 'Category', values_to = 'Count') |> # pivot from wide to long dataframe for plotting
  group_by(Date) |>
  mutate(Pixels = sum(Count)) |> # calculate sum of pixels for each date for generating histogram
  ungroup() |>
  mutate(Category = factor(Category, levels = c('0', '-1', '-2', '-3', '-4', '-5', '1', '2', '3', '4', '5'))) |> # convert Drought Category to a factor for plotting
  mutate(Percent = (Count/Pixels) * 100) |> # calculate percent of pixels in each class for generating histogram
  filter(Category %in% c('0', '-1', '-2', '-3', '-4', '-5')) # filter for drought categories
spei_df_plot
## # A tibble: 3,942 × 5
##    Date       Category Count Pixels Percent
##    <date>     <fct>    <int>  <int>   <dbl>
##  1 2015-01-05 0            0    590       0
##  2 2015-01-05 -1           0    590       0
##  3 2015-01-05 -2           0    590       0
##  4 2015-01-05 -3           0    590       0
##  5 2015-01-05 -4           0    590       0
##  6 2015-01-05 -5           0    590       0
##  7 2015-01-10 0            0    590       0
##  8 2015-01-10 -1           0    590       0
##  9 2015-01-10 -2           0    590       0
## 10 2015-01-10 -3           0    590       0
## # ℹ 3,932 more rows

Plotting the 180-day SPEI data in ggplot2

# make a figure to display SPEI in the sandy ESG
spei_plot <- ggplot(spei_df_plot, mapping = aes(x = Date, y = Percent, fill = Category))+
            geom_bar(stat = 'identity', width = 10)+
             scale_fill_manual(values = c('0' = '#FFFFFF', '-1' = '#FEFE33', '-2' = '#FFD580', '-3' = '#FFA500', '-4' = '#DC143C', '-5' = '#8C000F'), # define drought category colors according to NDMC: https://droughtmonitor.unl.edu/About/AbouttheData/DroughtClassification.aspx
                              labels = c('No drought', 'Abnormally\ndry', 'Moderate', 'Severe', 'Extreme', 'Exceptional'))+ # define drought category names according to NDMC: https://droughtmonitor.unl.edu/About/AbouttheData/DroughtClassification.aspx
            scale_y_continuous(breaks=c(0, 20, 40, 60, 80, 100), expand = c(0, 0)) +
  scale_x_date(expand = c(0, 0)) +  # the expand gets rid of gaps between x and y axis lines and the data 
            labs(title = '180-day SPEI Timeseries Plot for Sandy ESG', x = '', y = 'Percent', fill = 'Drought\nCategory')+
            theme_classic()+
            theme(legend.key=element_rect(colour="grey"))

spei_plot

Create summary tables of 180-day SPEI data

Often, you’ll want to provide a summary of drought or vegetation data for a publication, report, or presentation. Here, we’ll take the same SPEI data we exported to create tables of yearly maximum and mean drought extent across the Sandy ESG area.

First, as always, we need to prepare the data

# Clean and prepare data for summary table
spei_df_table <- spei_df |>
  mutate(Pixels = rowSums(across(where(is.numeric)))) |> # calculate total pixels in each row
  select('Date', 'D0' = '-1', 'D1' = '-2', 'D2' = '-3', 'D3' = '-4', 'D4' = '-5', 'Pixels') |> # keep the drought columns, renamed to USDM-style codes
  mutate(D0 = ((D0 + D1 + D2 + D3 + D4) / Pixels) * 100,
         D1 = ((D1 + D2 + D3 + D4) / Pixels) * 100,
         D2 = ((D2 + D3 + D4) / Pixels) * 100,
         D3 = ((D3 + D4) / Pixels) * 100,
         D4 = ((D4) / Pixels) * 100) |> # calculate new columns for the total area in class or in more severe class as percentage
  mutate(Year = as.numeric(stringr::str_extract(Date, "^\\d{4}"))) |> # create new column for year to group_by
  group_by(Year) # group by year; maximum and average areas in drought are summarized below

spei_df_table
## # A tibble: 657 × 8
## # Groups:   Year [9]
##    Date          D0    D1    D2    D3    D4 Pixels  Year
##    <chr>      <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl> <dbl>
##  1 2015-01-05     0     0     0     0     0    590  2015
##  2 2015-01-10     0     0     0     0     0    590  2015
##  3 2015-01-15     0     0     0     0     0    590  2015
##  4 2015-01-20     0     0     0     0     0    590  2015
##  5 2015-01-25     0     0     0     0     0    590  2015
##  6 2015-01-30     0     0     0     0     0    590  2015
##  7 2015-02-04     0     0     0     0     0    590  2015
##  8 2015-02-09     0     0     0     0     0    590  2015
##  9 2015-02-14     0     0     0     0     0    590  2015
## 10 2015-02-19     0     0     0     0     0    590  2015
## # ℹ 647 more rows

Now, we can use another of the summarize functions to produce clean tables of yearly mean and maximum drought extent across the Sandy ESG

# Get average area in each drought category (or greater) during each year and round to two decimals
spei_df_mean_table <- summarize_at(spei_df_table, c('D0', 'D1', 'D2', 'D3', 'D4'), ~round(mean(.), 2))
spei_df_mean_table
## # A tibble: 9 × 6
##    Year    D0    D1    D2    D3    D4
##   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1  2015  0.05  0     0     0     0   
## 2  2016 13.7   7.99  0.55  0     0   
## 3  2017  4.56  1.97  0.09  0     0   
## 4  2018 59.7  46.2   8.41  1.33  0.3 
## 5  2019 16.9  11.8   1.42  0     0   
## 6  2020 39.0  37    31.5  18.6   9.54
## 7  2021 48.8  46.4  30.6  19.7  14.2 
## 8  2022 53.5  46.9  17.5   5.39  0.79
## 9  2023 54.3  48.5  40.1  22.6  12.7
# Get maximum area in each drought category (or greater) during each year and round to two decimals
spei_df_max_table <- summarize_at(spei_df_table, c('D0', 'D1', 'D2', 'D3', 'D4'), ~round(max(.), 2))
spei_df_max_table
## # A tibble: 9 × 6
##    Year     D0     D1     D2    D3    D4
##   <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl>
## 1  2015   2.71   0.17   0      0    0   
## 2  2016  86.3   79.0   16.1    0    0   
## 3  2017  34.1   20.8    2.71   0    0   
## 4  2018  99.2   92.7   39.7   14.9  9.32
## 5  2019  71.0   58.0   17.3    0    0   
## 6  2020 100    100     99.2   93.6 59.3 
## 7  2021 100    100    100     99.3 97.3 
## 8  2022 100    100     72.4   43.0 12.0 
## 9  2023 100    100     99.8   96.4 75.2

Wrap it all together

Make a combined figure!

In R, we can combine multiple ggplot objects into a single figure. Recall the main figures we have created thus far:

us_state_map

zoomed_state_map

study_with_bench_eval

benchmark_boxplot

benchmark_barchart

rap_cover

spei_plot
## Warning: Removed 30 rows containing missing values (`position_stack()`).
## Warning: `position_stack()` requires non-overlapping x intervals

Combine four of these figures into one combined figure using ggarrange() from the ggpubr package

library(ggpubr) # for ggarrange(); harmless to reload if already attached

combined_benchmark_fig <- 
  ggarrange(study_with_bench_eval, benchmark_barchart, 
            rap_cover, spei_plot, # specify the figures to include
            nrow = 2, ncol = 2, # arrangement of combined figures
            labels = "AUTO") # adds capital letters for each graph panel
## Warning: Removed 30 rows containing missing values (`position_stack()`).
## Warning: `position_stack()` requires non-overlapping x intervals
combined_benchmark_fig # looks squished here because the figure is so big! Save it and look at the output file

ggsave("benchmark_combined_figure.png", plot = combined_benchmark_fig, height = 9, width = 13)